Hands-on Exercise 3: Spatial Interaction Models

Published

November 30, 2023

Modified

December 7, 2023

1. Overview

Spatial interaction represent the flow of people, material, or information between locations in geographical space. It encompasses everything from freight shipments, energy flows, and the global trade in rare antiquities, to flight schedules, rush hour woes, and pedestrian foot traffic.

Each spatial interaction, as an analogy for a set of movements, is composed of a discrete origin/destination pair. Each pair can be represented as a cell in a matrix where rows are related to the locations of origin and columns are related to locations of destination. Such a matrix is commonly known as an origin/destination matrix or a spatial interaction matrix.

In this hands-on exercise, we will build an OD matrix by using Passenger Volume by Origin Destination Bus Stops data set downloaded from LTA DataMall.

2. Getting Started

We will use the following R packages in this exercise:

  • sf for importing, integrating, processing and transforming geospatial data

  • tidyverse for importing, integrating, wrangling and visualising data

  • tmap for creating thematic maps

pacman::p_load(tmap, sf, DT, stplanr, 
               performance, ggpubr, tidyverse)

3. Preparing the Flow Data

3.1. Importing Origin Destination data

First, we will import the Passenger Volume by Origin Destination Bus Stops data set download from LTA DataMall:

odbus <- read_csv("data/aspatial/origin_destination_bus_202310.csv")
glimpse(odbus)
Rows: 5,694,297
Columns: 7
$ YEAR_MONTH          <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE            <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR       <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE             <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE      <chr> "04168", "04168", "80119", "80119", "44069", "2028…
$ DESTINATION_PT_CODE <chr> "10051", "10051", "90079", "90079", "17229", "2014…
$ TOTAL_TRIPS         <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…

It is observed that ORIGIN_PT_CODE and DESTINATION_PT_CODE are both recognised as integers. Since these are bus stop identifiers, they should be converted to factor type using the following code:

odbus$ORIGIN_PT_CODE <-
  as.factor(odbus$ORIGIN_PT_CODE) 

odbus$DESTINATION_PT_CODE <-
  as.factor(odbus$DESTINATION_PT_CODE)

glimpse(odbus)
Rows: 5,694,297
Columns: 7
$ YEAR_MONTH          <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE            <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR       <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE             <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE      <fct> 04168, 04168, 80119, 80119, 44069, 20281, 20281, 1…
$ DESTINATION_PT_CODE <fct> 10051, 10051, 90079, 90079, 17229, 20141, 20141, 1…
$ TOTAL_TRIPS         <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…

3.2. Extracting Study Data

We will extract commuting flows on weekdays between 6am and 9am.

odbus6_9 <- odbus %>%
  filter(DAY_TYPE == 'WEEKDAY') %>%
  filter(TIME_PER_HOUR >= 6 &
           TIME_PER_HOUR <= 9) %>%
  group_by(ORIGIN_PT_CODE, DESTINATION_PT_CODE) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS))

datatable(odbus6_9)

Save the output in rds format for future use:

write_rds(odbus6_9, "data/rds/odbus6_9.rds")

3.3. Importing Geospatial Data

The following geospatial data will be used in this exercise:

  • BusStop: Provides location of bus stops

  • MPSZ-2019: Provides sub-zone bondaries based on the URA Master Plan 2019

We will use the following codes to import the two data sets into R:

busstop <- st_read(dsn = "data/geospatial",
                   layer = "BusStop") %>%
  st_transform(crs = 3414)
Reading layer `BusStop' from data source 
  `C:\Users\noel1\Documents\School\02. Special Sem 1\ISSS624 Geospatial Analysis\noelnomel\ISSS624\Hands_on_Ex\Hands_on_Ex03\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
busstop
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21 / Singapore TM
First 10 features:
   BUS_STOP_N BUS_ROOF_N             LOC_DESC                  geometry
1       22069        B06   OPP CEVA LOGISTICS POINT (13576.31 32883.65)
2       32071        B23         AFT TRACK 13 POINT (13228.59 44206.38)
3       44331        B01              BLK 239  POINT (21045.1 40242.08)
4       96081        B05 GRACE INDEPENDENT CH POINT (41603.76 35413.11)
5       11561        B05              BLK 166 POINT (24568.74 30391.85)
6       66191        B03         AFT CORFE PL POINT (30951.58 38079.61)
7       23389       B02A              PEC LTD   POINT (12476.9 32211.6)
8       54411        B02              BLK 527 POINT (30329.45 39373.92)
9       28531        B09              BLK 536 POINT (14993.31 36905.61)
10      96139        B01              BLK 148  POINT (41642.81 36513.9)
mpsz2019 <- st_read(dsn = "data/geospatial",
                    layer = "MPSZ-2019") %>%
  st_transform(crs = 3414)
Reading layer `MPSZ-2019' from data source 
  `C:\Users\noel1\Documents\School\02. Special Sem 1\ISSS624 Geospatial Analysis\noelnomel\ISSS624\Hands_on_Ex\Hands_on_Ex03\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS:  WGS 84
mpsz2019
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21 / Singapore TM
First 10 features:
                 SUBZONE_N SUBZONE_C       PLN_AREA_N PLN_AREA_C       REGION_N
1              MARINA EAST    MESZ01      MARINA EAST         ME CENTRAL REGION
2         INSTITUTION HILL    RVSZ05     RIVER VALLEY         RV CENTRAL REGION
3           ROBERTSON QUAY    SRSZ01  SINGAPORE RIVER         SR CENTRAL REGION
4  JURONG ISLAND AND BUKOM    WISZ01  WESTERN ISLANDS         WI    WEST REGION
5             FORT CANNING    MUSZ02           MUSEUM         MU CENTRAL REGION
6         MARINA EAST (MP)    MPSZ05    MARINE PARADE         MP CENTRAL REGION
7                   SUDONG    WISZ03  WESTERN ISLANDS         WI    WEST REGION
8                  SEMAKAU    WISZ02  WESTERN ISLANDS         WI    WEST REGION
9           SOUTHERN GROUP    SISZ02 SOUTHERN ISLANDS         SI CENTRAL REGION
10                 SENTOSA    SISZ01 SOUTHERN ISLANDS         SI CENTRAL REGION
   REGION_C                       geometry
1        CR MULTIPOLYGON (((33222.98 29...
2        CR MULTIPOLYGON (((28481.45 30...
3        CR MULTIPOLYGON (((28087.34 30...
4        WR MULTIPOLYGON (((14557.7 304...
5        CR MULTIPOLYGON (((29542.53 31...
6        CR MULTIPOLYGON (((35279.55 30...
7        WR MULTIPOLYGON (((15772.59 21...
8        WR MULTIPOLYGON (((19843.41 21...
9        CR MULTIPOLYGON (((30870.53 22...
10       CR MULTIPOLYGON (((26879.04 26...

Save the output in rds format for future use:

write_rds(mpsz2019, "data/rds/mpsz2019.rds")

4. Geospatial Data Wrangling

4.1. Combining busstop and mpsz2019

We will use the following code to populate the planning subzone code from mpsz2019 sf dataframe to busstop sf dataframe:

bs_mpsz2019 <- st_intersection(busstop, mpsz2019) %>%
  select(BUS_STOP_N, SUBZONE_C) %>%
  st_drop_geometry()

datatable(bs_mpsz2019)

st_intersection() is used to perform point and polygon overlay, output is in point sf object. Five bus stops are dropped from the results as they are outside Singapore’s boundaries.

Save the output in rds format for future use:

write_rds(bs_mpsz2019, "data/rds/bs_mpsz2019.rds")

Next, we will append the planning subzone code from bs_mpsz2019 onto the odbus6_9 dataframe:

od_data <- left_join(odbus6_9, bs_mpsz2019,
                     by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
  rename(ORIGIN_BS = ORIGIN_PT_CODE,
         ORIGIN_SZ = SUBZONE_C,
         DESTIN_BS = DESTINATION_PT_CODE)

Check for duplicates using the following code:

duplicates <- od_data %>%
  group_by_all() %>%
  filter(n() > 1) %>%
  ungroup

glimpse(duplicates)
Rows: 1,186
Columns: 4
$ ORIGIN_BS <chr> "11009", "11009", "11009", "11009", "11009", "11009", "11009…
$ DESTIN_BS <fct> 01341, 01341, 01411, 01411, 01421, 01421, 01511, 01511, 0152…
$ TRIPS     <dbl> 1, 1, 4, 4, 17, 17, 19, 19, 2, 2, 1, 1, 11, 11, 24, 24, 1, 1…
$ ORIGIN_SZ <chr> "QTSZ01", "QTSZ01", "QTSZ01", "QTSZ01", "QTSZ01", "QTSZ01", …

Duplicate records are found in the dataframe. We will use the following code to retain the unique records:

od_data <- unique(od_data)

Next, we update the od_data dataframe with the planning subzone codes:

od_data <- left_join(od_data, bs_mpsz2019,
                     by = c("DESTIN_BS" = "BUS_STOP_N"))
duplicates <- od_data %>%
  group_by_all() %>%
  filter(n() > 1) %>%
  ungroup

glimpse(duplicates)
Rows: 1,350
Columns: 5
$ ORIGIN_BS <chr> "01013", "01013", "01112", "01112", "01112", "01112", "01121…
$ DESTIN_BS <chr> "51071", "51071", "51071", "51071", "53041", "53041", "51071…
$ TRIPS     <dbl> 2, 2, 66, 66, 4, 4, 8, 8, 1, 1, 1, 1, 24, 24, 6, 6, 4, 4, 21…
$ ORIGIN_SZ <chr> "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", …
$ SUBZONE_C <chr> "CCSZ01", "CCSZ01", "CCSZ01", "CCSZ01", "BSSZ01", "BSSZ01", …
od_data <- unique(od_data)
od_data <- od_data %>%
  rename(DESTIN_SZ = SUBZONE_C) %>%
  drop_na() %>%
  group_by(ORIGIN_SZ, DESTIN_SZ) %>%
  summarise(AM_PEAK = sum(TRIPS))

Save the output in rds format for future use:

write_rds(od_data, "data/rds/od_data.rds")

5. Visualising Spatial Interaction

We will prepare a desired line using the stplanr package.

5.1. Removing Intra-zonal Flows

We will use the following code to remove intra-zonal flows:

od_data1 <- od_data[od_data$ORIGIN_SZ != od_data$DESTIN_SZ,]

5.2. Creating Desire Lines

We will use od2line() from the stplanr package to create desire lines:

flow_line <- od2line(flow = od_data1,
                     zones = mpsz2019,
                     zone_code = "SUBZONE_C")

5.3. Visualising Desire Lines

We use the following code to visualise the desire lines:

tm_shape(mpsz2019) +
  tm_polygons() +
flow_line%>%
  tm_shape() +
  tm_lines(lwd = "AM_PEAK",
           style = "quantile",
           scale = c(0.1,1,3,5,7,10),
           n = 6,
           alpha = 0.3)

When the data flow is very messy and highly skewed, focusing on selected flows may give more clarity. For example, we can focus on flow greater than or equal to 5000:

tm_shape(mpsz2019) +
  tm_polygons() +
flow_line %>%
  filter(AM_PEAK >= 5000) %>%
tm_shape() +
  tm_lines(lwd = "AM_PEAK",
           style = "quantile",
           scale = c(0.1,1,3,5,7,10),
           n = 6,
           alpha = 0.3)